# End-to-end Speech Processing

Speechless Llama3.2 V0.1

Speechless is a compact, open-source text-to-semantic model (1 billion parameters) that generates discrete semantic tokens representing audio directly from text, without relying on a traditional text-to-speech (TTS) model.

Speech Recognition, Speech Synthesis, Supports Multiple Languages

Wav2vec2 Base 100k Gtzan Music Genres

An audio classification model based on the Wav2Vec 2.0 architecture, designed specifically for music genre recognition.

Audio Classification

An English automatic speech recognition (ASR) model fine-tuned from microsoft/wavlm-base on the english_ASR - CLEAN dataset, reaching a word error rate (WER) of 0.0773.

Speech Recognition

anjulRajendraSharma
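
To put the reported WER of 0.0773 in context: under the standard definition, WER is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words, so 0.0773 corresponds to roughly 7.7 errors per 100 reference words. The listing does not say how the figure was computed (for example, which text normalization was applied), so the following is only a minimal sketch of the standard metric. The function name `word_error_rate` is hypothetical and not part of any model's tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes whitespace tokenization and no text normalization; real
    evaluations usually lowercase and strip punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.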


© 2025 AIbase